ABSTRACT

Phase variation of surface structures occurs in 
diverse bacterial species due to stochastic, high 
frequency, reversible mutations. Multiple genes of 
Campylobacter jejuni are subject to phase variable 
gene expression due to mutations in polyC/G tracts. 
A modal length of nine repeats was detected for 
polyC/G tracts within C. jejuni genomes. Switching 
rates for these tracts were measured using 
chromosomally-located reporter constructs and 
high rates were observed for cj1139 (G8) and 
cj0031 (G9). Alteration of the cj1139 tract from G8 
to G11 increased mutability 10-fold and changed 
the mutational pattern from predominantly inser- 
tions to mainly deletions. Using a multiplex PCR, 
major changes were detected in 'on/off' status for 
some phase variable genes during passage of C. 
jejuni in chickens. Utilization of observed switching 
rates in a stochastic, theoretical model of phase 
variation demonstrated links between mutability 
and genetic diversity but could not replicate 
observed population diversity. We propose that 
modal repeat numbers have evolved in C. jejuni 
genomes due to molecular drivers associated with 
the mutational patterns of these polyC/G repeats, 
rather than by selection for particular switching 
rates, and that factors other than mutational drift
are responsible for generating genetic diversity
during host colonization by this bacterial pathogen.

INTRODUCTION 

Simple sequence repeats (SSR) in bacterial genomes 
provide a mechanism for stochastic variation in expression 
of specific genes and rapid adaptation to environmental 
fluctuations. This population-based adaptive process is 
termed phase variation (PV) and is widespread among 
bacterial pathogens and commensals (1-4). The phase
variable genes drive changes in the expression or structure 
of surface molecules and hence can influence host adapta- 
tion and virulence. 

SSR can mediate PV due to their high mutation rates 
and reversible mutations. Tetranucleotide repeats mediate 
PV in Haemophilus influenzae and tracts of 17 or more 
repeat units are known to mutate at high rates (>10^-4
mutations/division) with an excess of deletions over inser- 
tions (5). Many genes of Neisseria meningitidis and 
Campylobacter jejuni are subject to PV mediated by 
polyG/polyC repeat tracts (6,7). The meningococcal 
mononucleotide repeats generate PV frequencies of 
between 10^-4 and 10^-5 with mutability being influenced
by tract length and mismatch repair (MMR) (8-10). The 
mutational patterns of these repeats have not been clearly 
determined although many switches are constrained as 
the repeats are present in promoter elements. The mutabil- 
ity of the C. jejuni repeats is known to be high but there 
are no accurate measurements of the mutation rates or 
mutational patterns (11-13). Modelling of PV in H.
influenzae indicates that mutation rate and mutational 
patterns are critical determinants of population diversity 
and hence of adaptation to environmental fluctuations [(5); 
Palmer et al., in preparation]. Characterization of these 
facets is, therefore, crucial for understanding adaptation 
by and evolution of SSR-mediated PV. 

Campylobacter jejuni is a Gram negative, spiral-shaped 
bacterium which occurs as a commensal in the gastrointes- 
tinal tract of many species particularly birds (14,15). 
Campylobacter jejuni causes a severe gastroenteritis in 
humans and can also trigger autoimmune responses due 
to molecular mimicry between the lipooligosaccharide 
(LOS) and host gangliosides (11,16,17). Contaminated 
chicken meat is a key source of C. jejuni infection for 
humans and is responsible for a large proportion of the 
cases of gastroenteritis caused by this bacterial species. 
Multiple genes of C. jejuni are subject to PV due to 
polyG/C tracts located within the reading frame and 
some of these genes are known to contribute to the 
disease process or to colonization of chickens. For 
example, cj1139 encodes a glucosyltransferase and
mediates PV of LOS epitopes responsible for autoimmun- 
ity (11). Another gene, maf4, mediates changes in the 
glycosylation of the flagella causing alterations in 
autoagglutination phenotypes, a potential facilitator of 
host adaptation (18). The phase variable capA gene is
required for colonization of chickens but it is unclear if 
PV of this gene influences this phenotype (19). Variation 
in the chicken colonization potential of isogenic strains 
has been noted suggesting a role for PV in this process 
(20). While selection for motility during colonization of 
chickens is known to select for changes in short polyA 
tracts in flgR (21), there is less known about PV due to 
polyG/C tracts. Changes in the phase variable genes have 
been observed as C. jejuni adapts to replication in mice 
(22) and for the G8 tract of cj1139 in broiler chickens (23).
There has not, however, been a comprehensive examination
of the changes occurring in multiple phase variable genes
during colonization of chickens, although there has been
speculation that high PV rates in this species may rapidly
drive populations to a stable, steady-state combination of
ON/OFF states for each gene (13).

In order to understand the contributions of PV to 
host adaptation by C. jejuni, we determined the 
mutation rates/mutational patterns of the phase variable 
loci of C. jejuni using reporter constructs and examined 
changes in proportions of phase variants during growth 
of C. jejuni populations subject to differing selective envir- 
onments. We then utilized the experimentally-derived 
measurements of PV rates in a stochastic, mathematical 
model to examine whether high switching rates were solely 
responsible for driving fluctuations in proportions of 
phase variants during passage of C. jejuni populations. 

MATERIALS AND METHODS 

Strains and growth conditions 

A non-motile variant of C. jejuni strain NCTC11168 was
obtained from NCTC (6) and grown on Mueller-Hinton
agar (MHA) plates supplemented with vancomycin
(10 µg/ml) and trimethoprim (5 µg/ml) in a VA500 Variable
Atmosphere Incubator (Don Whitley, UK) using
microaerophilic conditions (4% oxygen, 10% carbon dioxide,
86% nitrogen) and at either 37°C or 42°C. A previously
isolated hypermotile variant (NCTC11168H; (24)) of this
strain was used for inoculation of chickens. A
chicken-adapted variant of strain 81-176 was also utilized
for inoculation of chickens. Escherichia coli strain DH5α was
grown on Luria agar plates supplemented with antibiotics
as required: ampicillin 50 µg/ml, kanamycin 50 µg/ml,
chloramphenicol 10 µg/ml.

Construction of reporters 

A reporter construct for cj1139 was constructed by
amplifying either end of this gene with two pairs of primers:
5'-GGAATTCCATATTTTACATCTTTACCCAC /
5'-CGGTACCAGTATTTGCATTGTAAATTC and
5'-GGTACCCGGGATCCAAAAATATCAATTCAAATGG /
5'-CCTGCAGAGAAGGTAATAATCCTTATG. The
upstream primers were designed to fuse a full-length
lacZ gene (3.5 kb) to the cj1139 gene at a location 96 bp
downstream of the polyG repeat tract, resulting in a
switching mechanism identical to that of the native gene.
The lacZ gene was obtained as a KpnI and BamHI
fragment from pGZ-17R (5). A chloramphenicol cassette
from pAV35 (25) was inserted into the BamHI site. A
similar cj0031 reporter construct was constructed using
primers 5'-AACTGCAGAAATTCTCATCACGGGTAG /
5'-CGGGATCCGCGGTACCGTGTATTTGTTAGAAGTG and
5'-CGGGATCCAAGATAGAACGCATAGG /
5'-AGCACCTTGTCCTAAGTGAG. In this case the
lacZ gene was located 13 bp downstream of the repeat
tract. The G8 tract in the cj1139-lacZ construct was
altered to G11 by site-directed mutagenesis using two
primers: 5'-TGGGTGGGGGGGGGGGTAAAATTGATTTGTTGTG
and 5'-TTTACCCCCCCCCCCACCCATATCCAAAATTTTAA.
Constructs were inserted into
the chromosome of C. jejuni strain NCTC11168 by
electroporation and selection for chloramphenicol resistance
as previously described (26). Transformants were tested
for insertion of the fusion constructs by PCR using
lacZBl, a primer located at the 5'-end of the lacZ
gene (5), and cj1139up (5'-GACCTAAAAAAGCATCACTAA),
which is located upstream of the cloned
fragment of cj1139.

PV assays 

Campylobacter jejuni reporter constructs were grown on 
MHA plates for three days. Serial dilutions of single 
colonies were prepared in Mueller-Hinton broth (MHB) 
and plated onto MHA plates supplemented with a 1:1000 
dilution of a 10 mg/ml X-Gal and IPTG solution
(Melford). These plates were incubated for 3-4 days and
the total numbers of colonies and numbers of blue and 
white colonies were counted. These numbers were utilized 
for calculation of PV frequencies and for derivation of PV 
rates as described previously (5). 

Boiled lysates of initial colonies and phase variants were 
prepared in 100 µl of distilled water. Repeat tracts were
amplified (see below for reaction conditions) with either
cj1139for and lacZBl or cj0031for and lacZBl. These
products were then subject to DNA sequencing. PCR
products were also subject to a GeneScan analysis (see
below) using FAM-labelled versions of primers cj1139for
or cj0031for and lacZBl.

PV rates for the capA gene were detected by colony 
immunoblotting (27). Briefly, C. jejuni colonies were 
transferred to nitrocellulose filters. After blocking for 1 
hour with phosphate buffered saline containing 0.1% 
Tween 20 (PBST), excess colony material was removed 
and filters were blocked for an additional 30 min. Filters
were probed for 2 h with a 1:2000 dilution of an
anti-CapA rabbit antiserum (19), washed three times
with PBST, probed for 1 h with 1:2000 dilution of a goat 
anti-rabbit alkaline phosphatase conjugate, washed again 
and developed with a commercial NBT/BCIP solution 
(PerkinElmer). 

Analysis of PV genotypes for C. jejuni 

Oligonucleotides were designed for amplification of the
repeat tracts of six phase variable genes of strains
NCTC11168 and 81-176 (Supplementary Table SI).
One primer of each pair was labelled with a fluorescent
dye (FAM or VIC) and each amplification product had
a unique size of between 100 and 500 bp. Boiled lysates
of colonies were prepared and the repeat tracts were
amplified using a combination of either three or
six pairs of primers. Each amplification contained 1 µl of
lysate, 1 µl of 10× PCR buffer (from Kapa Biosystems
and containing 15 mM MgCl2), 1.8 µl of 25 mM MgCl2,
1 µl of 10 mM dNTPs, 1 µl of a primer mix (i.e.
containing each primer at 2 µM), 0.1 µl TAQ DNA polymerase
(5 units) and 4.1 µl of distilled water. PCR reactions
were subjected to 25 cycles consisting of 30 s at 94°C,
30 s at 50°C and 1 min at 72°C. After amplification,
reactions were subjected to A-tailing: a 4 µl reaction
mix (0.4 µl of 10× PCR buffer, 0.4 µl of 25 mM MgCl2,
0.05 µl of 5 U/µl TAQ DNA polymerase, 3.15 µl of
distilled water) was added to each reaction and the
reactions were incubated for 45 min at 72°C. These PCR
products were analysed using a GeneScan assay (5).
Briefly, PCR reactions were diluted 1:5 or 1:10 in
distilled water and 0.5 µl of diluted PCR reaction was
mixed with 9.25 µl of deionised formamide and 0.25 µl
of GSLIZ500 (a size standard from Applied
Biosystems). These preparations were subjected to
electrophoresis on an ABI3700 and the resultant files were
analysed with PeakScanner (Applied Biosystems). The
size in base pairs of the highest peak was designated
as representative of the sample unless the relative area
of this and flanking peaks was < 1.2-fold (in most cases
there was a difference of >5-fold). Samples with ratios
of < 1.2-fold were designated as undetermined and not
utilized for genotype analysis unless both tract lengths
had the same phenotype (i.e. both OFF). These samples
may represent mixed colonies or replication slippage
during PCR amplification.
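
The peak-calling rule described above can be sketched in a few lines. This is a minimal illustration rather than the PeakScanner workflow itself: the function name, the use of peak area to pick the highest peak, and the 1.5 bp window used to decide which peaks count as flanking are all assumptions made for the example. The exception for samples in which both tract lengths give the same phenotype is not modelled.

```python
def call_tract_size(peaks, min_ratio=1.2, flank_window=1.5):
    """Return the size (bp) of the dominant peak, or None if undetermined.

    peaks: list of (size_bp, area) tuples for one locus in one sample.
    flank_window: hypothetical tolerance (bp) for treating a peak as flanking.
    """
    if not peaks:
        return None
    # The peak with the largest area is the candidate tract size (assumption:
    # area is used both for ranking and for the ratio test).
    top_size, top_area = max(peaks, key=lambda peak: peak[1])
    # Flanking peaks: other peaks within flank_window bp of the candidate.
    flank_areas = [
        area for size, area in peaks
        if size != top_size and abs(size - top_size) <= flank_window and area > 0
    ]
    # Undetermined when the candidate is < min_ratio-fold larger than a flank.
    if flank_areas and top_area / max(flank_areas) < min_ratio:
        return None
    return top_size


# Illustrative (made-up) peak tables: one clear call and one mixed sample.
clear = call_tract_size([(200.1, 5000.0), (201.0, 900.0)])
mixed = call_tract_size([(200.1, 1000.0), (201.1, 900.0)])
```

In this sketch a mixed sample is returned as `None`, so that downstream genotype coding can skip it explicitly instead of silently assigning a tract length.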

A sub-set of samples was subjected to DNA
sequencing in order to correlate the sizes of the PCR
fragments with particular tract lengths. Tract lengths
were then converted into an 'ON' or 'OFF' phenotype
based on an analysis of the reading frames for each
gene contained in the strain NCTC11168 genome
sequence. These expression data were then converted
into a binary format using a '1' for an ON phenotype
and a '0' for an OFF phenotype. A six-digit binary code
was then derived for each colony based on the
expression status of the six genes analysed.
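
The conversion from tract lengths to a six-digit code can be sketched as follows. The sketch assumes that a tract within a reading frame is ON whenever its length differs from a known in-frame length by a multiple of three; the function names are hypothetical, the gene order follows the legend of Figure 3, and the ON lengths and colony tract lengths in the example are illustrative placeholders, not measured values.

```python
# Gene order as listed in the legend of Figure 3.
GENE_ORDER = ("cj1326", "cj0031", "cj1139", "cj0685", "cj0045", "capA")


def phase_state(tract_length, on_length):
    """Return 1 (ON) if the tract keeps the gene in frame, otherwise 0 (OFF).

    Assumption: the gene is in frame whenever the tract length differs from
    a known in-frame length by a multiple of three nucleotides.
    """
    return 1 if (tract_length - on_length) % 3 == 0 else 0


def genotype_code(tract_lengths, on_lengths, gene_order=GENE_ORDER):
    """Build the dash-separated binary code for one colony."""
    return "-".join(
        str(phase_state(tract_lengths[gene], on_lengths[gene]))
        for gene in gene_order
    )


# Hypothetical in-frame (ON) tract lengths, for illustration only.
example_on_lengths = {
    "cj1326": 9, "cj0031": 9, "cj1139": 8,
    "cj0685": 9, "cj0045": 11, "capA": 11,
}
# Hypothetical tract lengths measured for a single colony.
example_colony = {
    "cj1326": 10, "cj0031": 10, "cj1139": 8,
    "cj0685": 9, "cj0045": 10, "capA": 11,
}
code = genotype_code(example_colony, example_on_lengths)
```

Keeping the ON lengths in a separate mapping means the same function can be reused for a second strain simply by supplying a different table.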

Chicken colonization experiments 

Two-week-old outbred commercial broiler chickens (PD
Hook, UK) were given a 0.1 ml dose of Campylobacter-
free gut flora prepared as previously described on the day 
of hatching (24). Campylobacter strains were grown over- 
night on sheep blood agar (Oxoid). A sweep from these 
plates was used to inoculate an MHB culture and grown 
for 24 h at 37°C. An inoculum was prepared from this 
culture and inoculated into 2-week-old chickens via the 
oral route. Dilutions of the inoculum were plated onto 
Campylobacter selective blood-free agar plates (Oxoid) 
and incubated for 3 days. Colony counts, immunoblots 
and boiled lysates of 30-60 colonies were prepared from 
these plates. The remaining cells in the inoculum culture 
were concentrated by centrifugation and utilized for prep- 
aration of a DNA lysate. Chickens were culled at 1 and/or 
14 days post inoculation and caecal material was collected 
from both caeca. Caecal material was resuspended in PBS 
and subjected to vigorous mixing using a vortex prior to 
plating of dilutions on Campylobacter selective blood-free 
agar plates (24). After incubation, the colonies were sub- 
jected to an identical analysis as for the inoculum. For the 
experiment with strain NCTC11168H, DNA was ex- 
tracted from the caecal material using a QiaAmp Stool 
kit (Qiagen). For strain 81-176, the inoculum contained 
large and small colonies that exhibited identical genotypes 
for the phase variable genes. 

Stochastic model 

The model was derived assuming a sufficiently large
bacterial population with time measured in generations. The
model also assumes that switching of each gene is
independent of all the other genes. Each bacterium has $d$ genes
that can be in an OFF or ON state coded as 0 and 1,
respectively, such that each bacterium is represented by
the random vector $\xi = (\xi_1, \xi_2, \ldots, \xi_d)$, where
$\xi_i$ can take only two values, 0 or 1. The random vector
$\xi$ has $2^d$ possible values from
$\Omega = \{A_i = (a_1, a_2, \ldots, a_d) \text{ with } a_j = 0, 1\}$,
where we label each element $A_i$ of $\Omega$ by a
number $i$ from 1 to $2^d$ in the increasing order
$A_1 = (0,0,\ldots,0), \ldots, A_{2^d} = (1,1,\ldots,1)$. Consider a
parent bacterium at time $n = 0$: $x(0) \in \Omega$. At time $n = 1$
(i.e. after the first division), the parent bacterium
produces two offspring: $\xi(1;1;x(0))$ and $\xi(1;2;x(0))$ from
$\Omega$. We assume that $\xi(1;1;x(0))$ and $\xi(1;2;x(0))$ are
conditionally [conditioned on $x(0)$] independent random vectors
and we introduce the transitional probabilities:

$$p_{ij} = \mathrm{Prob}(\xi(1;1;x(0)) = A_j \mid x(0) = A_i), \tag{1}$$



Nucleic Acids Research, 2012, Vol. 40, No. 13 5879 



from which we form the $2^d \times 2^d$ matrix of transitional
probabilities

$$P = (p_{ij}). \tag{2}$$

We assume that all $p_{ij} > 0$ and that on/off switching
of genes happens independently, i.e. if we introduce
$2d$ transitional probabilities:

$$p_i = \mathrm{Prob}(\xi_i = 1 \mid x_i(0) = 0) \quad \text{and} \quad q_i = \mathrm{Prob}(\xi_i = 0 \mid x_i(0) = 1), \tag{3}$$

then

$$p_{ij} = \prod_{m=1}^{d} p_m^{\alpha(i,j;m;0,1)} (1 - p_m)^{\alpha(i,j;m;0,0)} q_m^{\alpha(i,j;m;1,0)} (1 - q_m)^{\alpha(i,j;m;1,1)}, \tag{4}$$

where $\alpha(i,j;m;l,k) = 1$ if $A_i$ has the $m$-th component
equal to $l$ and $A_j$ has the $m$-th component equal to $k$,
otherwise $\alpha(i,j;m;l,k) = 0$. We also assume that the
transition matrix $P$ does not change with time.
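
As an illustration of how the transition matrix follows from the per-gene switching probabilities under the independence assumption, the sketch below builds the matrix for a small number of genes. The function name and the numerical rates are hypothetical; the states are enumerated in the increasing order used for labelling the genotypes, from all OFF to all ON.

```python
from itertools import product


def transition_matrix(p, q):
    """Build the genotype transition matrix from per-gene switching rates.

    p[m]: OFF-to-ON probability per division for gene m.
    q[m]: ON-to-OFF probability per division for gene m.
    Returns (states, P), where P[i][j] is the probability that a parent of
    type states[i] produces an offspring of type states[j].
    """
    d = len(p)
    # Increasing order: (0,...,0) first, (1,...,1) last.
    states = list(product((0, 1), repeat=d))

    def gene_prob(m, parent_state, child_state):
        # One factor of the product over genes: switch or stay for gene m.
        if parent_state == 0:
            return p[m] if child_state == 1 else 1.0 - p[m]
        return q[m] if child_state == 0 else 1.0 - q[m]

    matrix = []
    for parent in states:
        row = []
        for child in states:
            prob = 1.0
            for m in range(d):
                prob *= gene_prob(m, parent[m], child[m])
            row.append(prob)
        matrix.append(row)
    return states, matrix


# Hypothetical rates for two genes, chosen only to exercise the function.
states, P = transition_matrix(p=[2e-4, 5e-4], q=[4e-4, 1.6e-3])
```

Because the matrix has one row and one column per genotype, its size doubles with every additional gene; six genes give a 64 by 64 matrix, which is still small enough to build explicitly.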

We continue the dynamics so that at time $n = 2$ the
bacteria $\xi(1;1;x(0))$ and $\xi(1;2;x(0))$ produce their four
offspring and so on. This way we obtain a branching
tree as a result of binary fission. Denote by $Z_k(n|x_0)$ the
number of bacteria of type $A_k$ in the population after $n$
divisions starting from the bacterium with $x_0 = x(0) \in \Omega$. It
obviously depends on a realization $\omega$ of the branching tree
and its more detailed notation is $Z_k(n|x_0)(\omega)$. The
collection $Z(n|x_0)(\omega) = \{Z_k(n|x_0)(\omega),\ k = 1, 2, \ldots, 2^d\}$
gives us a population living on the set $\{1, 2, \ldots, 2^d\}$ and
$\sum_{k=1}^{2^d} Z_k(n|x_0)(\omega) = 2^n$. Now, we randomly (i.e. each
time independently) draw a member (i.e. a bacterium)
from this population and ask the question with what
probability its type is $A_k$. For a fixed $\omega$ (i.e. for a particular
realization of the branching tree), the probability to pick a
bacterium of the type $A_k$ is equal to
$\rho_k(n|x_0)(\omega) = \frac{1}{2^n} Z_k(n|x_0)(\omega)$.
This is a random distribution analogous to the
random measure appearing in Wright-Fisher-type
models [see e.g. (28)]. Here we consider an average of
the distribution $\rho_k(n|x_0)(\omega)$. Assume that we can put
together all possible realizations of the branching trees
$Z(n|x_0)(\omega_1), Z(n|x_0)(\omega_2), \ldots$ occurring during binary
fission after $n$ divisions; then the proportion of bacteria
of the type $A_k$ in this total collection is equal to

$$\pi_k(n|x_0) = \sum_{j=1}^{2^n} \frac{j}{2^n} \mathrm{Prob}[Z_k(n|x_0)(\omega) = j] = \frac{1}{2^n} E Z_k(n|x_0) = E \rho_k(n|x_0). \tag{5}$$



The meaning of the average $\pi_k(n|x_0)$ is as follows. If we
take all possible binary trees (this is where we use the near
infinite population size assumption) started from $x_0$ and
put their results after $n$ divisions together then this
average gives us the proportion of bacteria of the type
$A_k$. Obviously,
$\pi(n|x_0) = [\pi_1(n|x_0), \pi_2(n|x_0), \ldots, \pi_{2^d}(n|x_0)]$
is a probability measure defined on the set $\{1, \ldots, 2^d\}$. It
is not difficult to compute $\pi_k(n|x_0)$ using the earlier
assumptions on transition probabilities (1)-(2) and obtain

$$\pi(n|\pi(0)) = \pi(0) P^n, \tag{6}$$

where the vector $\pi(0)$ is the initial distribution of $x_0 = A_m$
for some $m$, i.e. it is the vector for which all components
equal zero except an $m$-th component which equals 1.
Instead of starting with a single bacterium, we can start
with an initial population having distribution $\pi(0)$.
Equation (6) remains valid but it should be kept in mind
that in this case we average in Equation (5) both over the
initial distribution and over all possible trees growing
from each draw from $\pi(0)$. Furthermore, in the above
consideration we assumed that all offspring survive and
the population size grows exponentially. Equation (6)
remains valid when the number of bacteria of each type
$A_k$ dying at time $t = n$ is proportional to $\pi_k(n|x_0)$ under the
condition that the population remains of a sufficiently
large size $N$. The same modelling approach was used,
e.g. in (29), though resulting in a different model within
a different biological set-up. The considered model takes
into account mutation drift only and does not include
selection or bottlenecks.

Stationary distribution

Under our assumption that $p_{ij} > 0$ the distribution
$\pi(n|\pi(0))$ has the unique limit $\pi^s$ when $n$ tends to infinity.
The stationary distribution $\pi^s$ is independent of the
initial distribution $\pi(0)$ and is the eigenvector of the
transition matrix $P$ corresponding to the unit eigenvalue:
$\pi^s = \pi^s P$. Using (3)-(4), we obtain that the components
$\pi_i^s$ of $\pi^s$ are equal to
$\pi_i^s = \prod_{j=1}^{d} \nu_j^{\alpha(i,j;0)} (1 - \nu_j)^{\alpha(i,j;1)}$, where
$\alpha(i,j;l) = 1$ if the $A_i$ type has the $j$-th gene in the state $l$
and $\alpha(i,j;l) = 0$ otherwise, and $\nu_j = q_j/(p_j + q_j)$, $j = 1, \ldots, d$.
The speed of convergence of $\pi(n|\pi(0))$ to $\pi^s$ depends
on the transition matrix $P$ and on the initial distribution
$\pi(0)$. The number of time steps (i.e. generations) $n_s$
required for $\pi(n|\pi(0))$ to reach a proximity of $\pi^s$ can be
estimated via the maximum $\lambda$ of the non-unit eigenvalues
$1 - p_j - q_j$ of the matrices

$$P_j = \begin{pmatrix} 1 - p_j & p_j \\ q_j & 1 - q_j \end{pmatrix}$$

formed for each gene $j$ from the transition matrix $P$. Roughly,
$n_s \approx \ln[\varepsilon / \|\pi^s - \pi(0)\|] / \ln \lambda$, where
$\|\pi^s - \pi(0)\|$ is the (e.g. total variation) distance between
the initial distribution $\pi(0)$ and the stationary distribution
$\pi^s$ and $\varepsilon$ is the desirable closeness of
$\pi(n_s|\pi(0))$ and $\pi^s$. The speed of convergence to
the stationary distribution is limited by the gene with the
lowest switching rate.
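
The behaviour of the model can be explored numerically without forming the full transition matrix, because the genes switch independently. The sketch below uses the standard closed form for a two-state Markov chain to follow the ON probability of one gene over n generations, multiplies per-gene terms to obtain a stationary genotype probability, and applies the rough estimate of the number of generations needed to approach the stationary distribution. The function names are hypothetical, and pairing the ON-to-OFF and OFF-to-ON rates from Table 1 for cj1139 and capA is a simplification made for illustration, since each OFF state in practice has its own tract length and rate.

```python
import math
from itertools import product


def on_probability(n, p, q, on0):
    """ON probability of one gene after n generations.

    p: OFF-to-ON rate, q: ON-to-OFF rate, on0: initial ON probability.
    Uses the two-state closed form with eigenvalue 1 - p - q.
    """
    lam = 1.0 - p - q
    stationary_on = p / (p + q)
    return stationary_on + (on0 - stationary_on) * lam ** n


def stationary_genotype_probability(state, p, q):
    """Stationary probability of a genotype given as a tuple of 0/1 states."""
    prob = 1.0
    for s, pj, qj in zip(state, p, q):
        nu = qj / (pj + qj)  # stationary OFF probability for this gene
        prob *= nu if s == 0 else 1.0 - nu
    return prob


def generations_to_converge(p, q, eps, dist0):
    """Rough n_s = ln(eps / dist0) / ln(lambda), lambda = max(1 - p_j - q_j)."""
    lam = max(1.0 - pj - qj for pj, qj in zip(p, q))
    return math.log(eps / dist0) / math.log(lam)


# Illustrative rates (mutations/division) taken from Table 1:
# cj1139 (OFF-to-ON 2.15e-4, ON-to-OFF 4.23e-4) and
# capA (OFF-to-ON 4.88e-4, ON-to-OFF 16.41e-4).
p_rates = [2.15e-4, 4.88e-4]
q_rates = [4.23e-4, 16.41e-4]

# Sum of stationary probabilities over all genotypes (should equal 1).
total = sum(
    stationary_genotype_probability(state, p_rates, q_rates)
    for state in product((0, 1), repeat=len(p_rates))
)
# Generations needed to come within 1% of the stationary distribution,
# starting from the maximal distance of 1.
n_s = generations_to_converge(p_rates, q_rates, eps=0.01, dist0=1.0)
```

With these illustrative values the estimate is governed by cj1139, the gene with the smaller sum of switching rates, and comes out at several thousand generations, in line with the statement that convergence is limited by the slowest-switching gene.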



RESULTS 

Conservation and tract lengths of polyG/polyC repeats 
in C. jejuni genomes 

Campylobacter jejuni genomes contain multiple polyG or 
polyC mononucleotide repeat tracts. Eight genomes were 
analysed to ascertain the lengths and conservation of these 
tracts. The genomes contained 11-29 tracts of seven or 
more repeat units. The modal repeat number for the 
polyG/polyC tracts in seven of these genomes was nine 
and in the other was 10 with 64-95% of tracts containing 
9 or 10 repeats (Figure 1). This pattern contrasts with the 
low GC content of these genomes (31%) and a dearth of 
G5, G6, G7 and G8 tracts (303, 18, 0 and 4 loci in strain
NCTC11168, respectively). The four complete genomes 
(from strains NCTC11168, 81-176, 1221 and 81116) 
were analysed to ascertain conservation of these tracts. 
A total of 55 repeat-associated loci were detected 
in these four genomes with the majority (80%) of 
these tracts being polyG repeats present within the 
reading frames and on the coding strand of genes 
(Supplementary Table SI). Only five loci were present 
and contained repeat tracts in all four genomes with 
only three of these tracts being within genes 
(Supplementary Figure SI). Overall, C. jejuni exhibits a 
high number of phase variable genes with a bias towards 
G9/G10 tracts but paradoxically only weak conservation 
of the PV mechanism in specific loci. 

Phase variable genes of C. jejuni have high switching rates 

The capA gene, encoding a 120-kDa autotransporter, of
C. jejuni strain NCTC11168 contains a G11 repeat tract
within the reading frame, which is preceded by a polyT
tract of six residues depending on the source of the strain
[Ashgar et al. (19) report this gene as having T5-G10
tracts]. Both directions of switching of this receptor were
measured by colony immunoblotting with a CapA-specific
polyclonal antiserum. The ON variants contained a G11
tract while OFF variants contained either G10 or G12
tracts. ON-to-OFF PV occurred at a rate of 1.6 × 10^-3
mutations/division while OFF-to-ON PV rates were
1.8-fold higher for G12 variants or 3.3-fold lower for
G10 variants (Table 1).

Figure 1. Lengths of polyC/G repeat tracts in C. jejuni genomes.
Published genome sequences (z-axis) were scanned for simple
sequence repeat tracts containing seven or more C or G residues. The
x-axis indicates the lengths of the tracts; the y-axis indicates the number
of loci containing tracts of a particular length.

As antisera were not available for other phase variable
epitopes or gene products, reporter constructs were
created to analyse the PV rates of two other C. jejuni
genes, cj1139 and cj0031. In both cases a lacZ gene
lacking an initiation codon was fused in frame to the
gene, such that changes in repeat number would alter
β-galactosidase expression (Figure 2). The cj1139-lacZ
reporter construct contains a G8 tract, which was
altered by site-directed mutagenesis to a G11 tract
(Figure 2). The cj0031-lacZ reporter construct contains a
G9 tract at the 3'-end resulting in fusion of cj0031 with
cj0032 upon changes in length of this tract. These
constructs were recombined into the C. jejuni chromosome
resulting in detection of high-level expression of
β-galactosidase and phase variable changes in expression
when these constructs were plated on MHA plates
containing X-Gal (Supplementary Figure S3).

PV rates were measured for these constructs for both
directions of switching. High PV rates were detected for
all constructs with a trend for rates to increase as
a function of tract length from 1 × 10^-4 mutations/
division with 11168-cj1139-G7-cat to 4.1 × 10^-3 with
11168-cj1139-G11-cat (Table 1). The ON-to-OFF PV
rates for the isogenic 11168-cj1139-G8-cat and
11168-cj1139-G11-cat constructs exhibited a significant
difference (P < 0.0001; using a Mann-Whitney non-parametric
rank sum test, InStat 2.0), with the 9.5-fold higher rate for
the G11 construct demonstrating that tract length exerts a
major effect on PV rates in this species. The observation of
slightly higher PV rates for the 11168-cj1139-G8-kan
as compared to 11168-cj1139-G8-cat suggests that PV
rates are influenced by the context of the reporter.
However, comparable rates were observed for the


Table 1. Phase variation rates of C. jejuni genes

                     ON-to-OFF                                                  OFF-to-ON
Gene (reporter)      Tract^a (n)  Freq.^b (×10^-3)  Rate^c (×10^-4)             Tract (n)  Freq. (×10^-3)  Rate (×10^-4)
cj1139 (lacZ-cat)    G8 (36)      3.83              4.23 {1.0} (3.0-5.7)        G9 (24)    1.81            2.15 {1.0} (1.4-2.8)
                     G11 (13)     49.45             40.54 {9.5} (35.3-64.6)     G10 (17)   3.39            3.67 {1.7} (3.2-4.8)
cj1139 (lacZ-kan)    G8 (22)      10.08             10.44 {2.5} (6.9-16.6)      G7 (7)     0.65            1.00 {0.5} (2.8-0.5)
cj0031 (lacZ-cat)    G9 (15)      13.52             12.30 {2.9} (9.1-22.2)      G10 (22)   17.82           17.88 {8.3} (11.0-40.2)
capA (antibody)      G11 (14)     18.44             16.41 {3.9} (11.2-30.9)     G12 (23)   32.59           29.49 {13.7} (22.1-43.0)
                                                                                G10 (11)   4.29            4.88 {2.3} (2.5-8.3)

^a Numbers in brackets indicate number of colonies examined.
^b Median frequency.
^c PV rates were estimated according to (42), numbers in curly brackets are fold increase over shortest tract, numbers in square brackets are 95% CIs
calculated according to (43). Statistical tests of differences in PV rates are provided in the Supplementary Data.
Table 2. Mutational patterns of the phase variable genes of C. jejuni

[Figure 2 diagram labels: (a) cj1140 (cstIII), cj1139c (wlaN), cj1138, G8 tract; (b) lacZ (no ATG), G8 tract; (c) G11 tract]


Figure 2. Schematic representation of the reporter constructs for
cj1139c. The upper diagram, (a), represents the wild-type locus in C.
jejuni strain NCTC11168. The ORFs are represented by shaded
rectangles and the direction of transcription by dotted lines. The repeat
tract in cj1139c is indicated by a series of small, white rectangles. The
middle diagram, (b), represents the reporter construct. In this construct,
a lacZ gene lacking a promoter and initiation codon was fused to
cj1139c downstream of the repeat tract. A chloramphenicol cassette
(cat) was inserted at the 3'-end of the lacZ gene and utilized as a
selective marker during insertion of this construct into the chromosome
of C. jejuni strain NCTC11168. The lower diagram, (c), shows the
reporter construct containing a G11 repeat tract. This construct was
derived from the G8 construct by site-directed mutagenesis of the
repeat tract prior to recombination into the chromosome.



reporter constructs as compared to immunoblotting
(e.g. PV rates for cj1139-G11-cat were 2.7-fold higher
than for capA-G11 but similar for cj1139-G10-cat and
capA-G10), indicating that the reporter constructs
provide an accurate measure of switching rates. These
results showed that C. jejuni PV rates are high and
increase as a function of tract length.

Mutational patterns of C. jejuni genes vary as a function 
of tract length 

The mutational events responsible for changes in
expression in phase variants were examined by either sequencing
or sizing of PCR products spanning the repeat tract.
All PV events involved changes by a single repeat unit
(Table 2). This means that the ON-to-OFF PV rates are
measurements of a combination of insertions and
deletions of single nucleotides while the OFF-to-ON PV
rates measure either insertions (+1) or deletions (-1) of
one repeat. Thus the OFF-to-ON PV rates for
11168-cj1139-G9 and 11168-cj0031-G10 provide
measurements of -1 deletion rates while 11168-cj1139-G7 and
11168-cj1139-G10 are estimates of +1 rates. In the case
of ON-to-OFF rates of the reporter constructs, we
observed a preponderance of insertions over deletions
for G8 and G9 tracts of 11:1 (11168-cj1139-G8-cat), 24:1
(11168-cj1139-G8-kan) and 6:1 (11168-cj0031-G9) but an
opposing prevalence of deletions over insertions for G11
tracts of 23:1 (11168-cj1139-G11) or 2:1 (11168-capA-G11).
Using the data for the lacZ reporter constructs,
mutation rates (reported as mutations/division × 10^-4)
were estimated for a range of -1 repeat deletions (G8 to
G7, G9 to G8, G10 to G9 and G11 to G10 were 0.4, 2.1,
17.9 and 38.8, respectively) and +1 repeat insertions (G7 to
G8, G8 to G9, G9 to G10, G10 to G11 and G11 to G12
were 1.0, 6.9, 10.3, 3.7 and 1.8, respectively). A major
difference was observed between these mutational events,
with a tract-length-dependent increase in the frequency of
-1 deletions for G8 to G11 tracts whereas +1 insertional
events exhibited a peak at a tract length of G9 and then
decreased in frequency in longer tracts.
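
One way to see how direction-specific rates relate to a combined ON-to-OFF rate is to apportion that rate by the observed ratio of insertions to deletions. The sketch below does this for the cj1139 G8 and G11 reporters using the rates in Table 1 and the ratios given above. It is an assumed calculation with a hypothetical function name; the published estimates may have been derived from raw variant counts or pooled across constructs, so small differences from the quoted values are to be expected.

```python
def split_rate(total_rate, insertions, deletions):
    """Apportion an ON-to-OFF rate into +1 and -1 components.

    total_rate: combined ON-to-OFF rate (mutations/division x 10^-4).
    insertions, deletions: observed counts or ratio terms for +1 and -1 events.
    Returns (insertion_rate, deletion_rate) in the same units as total_rate.
    """
    events = insertions + deletions
    return total_rate * insertions / events, total_rate * deletions / events


# cj1139-G8-cat: rate 4.23, insertions to deletions 11:1.
g8_insertion, g8_deletion = split_rate(4.23, insertions=11, deletions=1)
# cj1139-G11-cat: rate 40.54, deletions to insertions 23:1.
g11_insertion, g11_deletion = split_rate(40.54, insertions=1, deletions=23)
```

Under this assumption the G8 deletion component is about 0.35 and the G11 deletion component about 38.9, close to the quoted 0.4 and 38.8, which illustrates the shift from mainly insertions in the shorter tract to mainly deletions in the longer one.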

Minor changes in the genotypes of the phase variable 
genes of C. jejuni occur during in vitro passage 

The high PV rates of C. jejuni genes suggested that 
changes in the proportions of variants for individual 
genes and for combinations of genes may occur rapidly 
during growth of populations of this bacterial species even 
in the absence of selection. To investigate the capacity 
for change, passage experiments were performed 
with a laboratory-adapted variant of C. jejuni strain 
NCTC11168. Six phase variable genes were examined, 
which included the three genes with determined PV rates 
(Table 1), a range of tract lengths (G8-G11), alternate 
positions of repeat tracts in the genes (i.e. start, middle 
and end; see Supplementary Table SI) and an alternate 
transcriptional orientation (C8, cj0685). An initial exam- 
ination of the numbers of variants in single colonies grown 
on MHA plates only detected a low level of variants in 
agreement with the observed switching rates 
(Supplementary Table SII). This strain was then subjected
to three rounds of growth in MHB using either a constant 
or variable initial inoculum (Figure 3 and Supplementary 
Table SIII). The tract lengths were determined for 
multiple colonies from the input and the eleven output 
populations using a multiplex PCR and a GeneScan 
assay (Supplementary Table SIV). The tract lengths, 
which were all present within the reading frames of the 
associated genes, were then converted into ON/OFF 
phenotypes (Supplementary Table SV). A significant 
change was observed only in the proportions of ON and 
OFF variants for capA, which went from 27% OFF in the
input to an average of 56% OFF (±0.1) in the output 
populations. Combined genotypes for all six genes were 
derived for each colony and utilized for examination of 
changes in the complexity of the populations (Figure 3). A 
total of 18 genotypes were detected from a potential of 64 
genotypes. The output populations were similar to each 
other and only differed markedly from the inoculum due 
to a decrease in 0-0-1-1-0-1 from 47% to 31% (± 11.9) 
and an increase in 0-0-1-1-0-0 from 10% to 41% (±8.4), 
Figure 3. Changes in the proportions of each genotype following in vitro passage of C. jejuni strain NCTC11168. An initial inoculum of either
constant (Ci; 3.5 x 10~ 8 cfu) or variable (Vi; 3.5 x 10~ 8 to 3.5 x 10~ cfu) size was subjected to three passages in 5 ml of MHB (see Supplementary Data for
further details). Genotypes were derived from 30 colonies for the inoculum and each output sample by assigning a '1' for an 'on' or a '0' for an 'off' phenotype
to each of six genes (cj1326, cj0031, cj1139, cj0685, cj0045 and capA, respectively) based on the numbers of repeats present within the gene. Inoc,
inoculum; Vi-1 to Vi-5, variable inoculum samples; Ci-1 to Ci-6, constant inoculum samples.



which correlated with an ON-to-OFF switch in the cap A 
gene. The output populations for the variable inoculum 
samples were similar indicating that an initial inoculum of 
>10 cfu had not influenced population structure. In 
summary, short-term in vitro passage produced only 
limited genotypic variation in these six genes indicating 
that C. jejuni phase variable genotypes are relatively 
stable when passaged in a non-changing environment 
using large input populations. 
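
The conversion of per-colony tract lengths into combined genotypes described above can be sketched as follows. This is a minimal illustration and not the analysis script used in the study. The names `GENES`, `ON_LENGTHS`, `genotype` and `genotype_proportions` are hypothetical, the ON lengths for cj1326, cj0045 and capA are placeholders, and the sketch assumes a single ON length per gene, whereas in practice any tract length that keeps the gene in frame would be scored as ON.

```python
from collections import Counter

# Gene order used for the genotype strings in Figure 3.
GENES = ["cj1326", "cj0031", "cj1139", "cj0685", "cj0045", "capA"]

# Tract length treated as ON for each gene. Values for cj0031, cj1139 and
# cj0685 follow the text; those for cj1326, cj0045 and capA are placeholders
# chosen for illustration only.
ON_LENGTHS = {"cj1326": 9, "cj0031": 9, "cj1139": 8,
              "cj0685": 9, "cj0045": 11, "capA": 11}


def genotype(tract_lengths):
    """Return a string such as '0-0-1-1-0-1' for one colony."""
    return "-".join(
        "1" if tract_lengths[g] == ON_LENGTHS[g] else "0" for g in GENES
    )


def genotype_proportions(colonies):
    """Return {genotype: proportion} for a list of per-colony tract lengths."""
    counts = Counter(genotype(c) for c in colonies)
    total = sum(counts.values())
    return {g: n / total for g, n in counts.items()}


# Six genes, each ON or OFF, give 2**6 = 64 possible combined genotypes.
N_POSSIBLE = 2 ** len(GENES)
```

Passing one dictionary of tract lengths per colony to `genotype_proportions` yields the kind of per-population genotype profile plotted in Figure 3.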

Varying patterns of changes in the genotypes of the phase 
variable genes of C. jejuni occur during in vivo passage 

Campylobacter jejuni is a commensal of chickens and 
colonization leads to high and persistent numbers of bac- 
terial cells in the caecum. To explore the extent of PV 
occurring during colonization of this natural host, a 
group of 10 2-week-old chickens were inoculated with a 
high dose (1 x 10^8 cfu) of a hypermotile variant of C. jejuni
strain NCTC11168. Five birds were sacrificed after 1 day 
and examined for the presence of C. jejuni in caecal 
contents. No growth was detected indicating a low level 
of initial colonization by this strain. Caecal contents from 
the other five birds were examined 2 weeks after inocula- 
tion and high levels of C. jejuni cells were detected 
(ranging from 1.6 x 10^7 to 4 x 10^8 cfu/g caecal contents).
A DNA isolation procedure was performed on caecal 
contents and C. jejuni genes were readily amplified from 
these extracts. Repeat tract lengths were determined for
six genes for 30 colonies from the inoculum and each bird 
and for the DNA extracts. Comparisons were performed 
between the major repeat tract length detected in DNA 
extracts and the genotype present in the majority of 
colonies (Supplementary Tables SVI and SVII). For four 
of the genes (cj0031, cj0685, cj1139 and cj1326) the lengths
were identical. Differences were detected for capA in B11



and for cj0045 in the inoculum, B8 and B9 samples but in 
most of these cases the ratios between major and minor
peaks identified by GeneScan were low indicating that 
a mixture of variants was present in the sample 
(Supplementary Table SVI). These results provided 
evidence that the growth (~20 generations) required to 
generate the colonies had not generated large variations 
in the ON/OFF status of the phase variable genes such 
that analysis of multiple colonies provides an accurate re- 
flection of in vivo genotypes. 

Analysis of the genotypes in the input and output 
samples for the infections with C. jejuni strain 
NCTC11168H detected significant changes from the 
inoculum in the proportions of variants for three genes 
(Supplementary Table SVIII): cj0031 switched from
OFF (G10) to ON (G9); cj1139 switched from ON (G8)
to OFF (G9); and cj0685 switched from OFF (G8) to ON
(G9) (P-values of <0.0001 were obtained in a Chi-squared
test for each bird for each gene as compared to the 
inoculum using InStat 1.0). There were also bird-to-bird 
variations in the ON/OFF variants for cj0045 in birds B8 
and B9 (P-values of <0.001 were obtained for comparisons
with data from B6 and B11) and for capA in birds B7
and B8 (P-values of <0.01 were obtained for B7 versus
B11, B8 versus B9 and B8 versus B11 and 0.04 for B7
versus B9). These differences were reflected in major dif- 
ferences in genotype distributions between the inoculum 
and output populations (Figure 4). A total of 22 genotypes 
were detected. Nine genotypes were present in the 
inoculum but only three of these genotypes were 
detected in output populations and in each case in only 
one bird. The bird-to-bird variations were mostly due to 
differences in the levels of two variants — 0-1-0-1-0-0 and 
0-1-0-1-0-1 — which exhibit opposing ON/OFF pheno- 
types for capA. The major shift in the genotypes 
detected during this in vivo passage experiment highlights 
Figure 4. Changes in the proportions of genotypes following in vivo passage of a hypermotile variant of C. jejuni strain NCTC11168. Two-week-old
out-bred chickens were inoculated with 1 x 10 cfu. Caecal samples were collected 2 weeks after inoculation and C. jejuni was enumerated by growth 
of dilutions on selective plates. The genotypes for phase variable genes were derived by the same approach as in Figure 3 for the same six genes. 
Genotypes were derived for 30 colonies from the inoculum and for 23-29 colonies from output populations following growth under identical 
conditions. Inoc, inoculum; B6-B11, individual birds. 



the significant potential for infection to alter population 
structure of the C. jejuni phase variable genes. 
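
The per-gene comparisons of ON/OFF proportions between the inoculum and each bird can be illustrated with a Pearson Chi-squared test on a 2 x 2 table. The sketch below is an assumption-laden illustration: the function name `chi_squared_2x2` is hypothetical, no continuity correction is applied (whether the InStat analysis applied one is not stated here), and any counts passed to it in an example are invented rather than taken from the Supplementary Tables.

```python
import math


def chi_squared_2x2(on_a, off_a, on_b, off_b):
    """Pearson chi-squared statistic and P-value (1 d.f.) for a 2x2 table.

    Rows are two populations (e.g. inoculum and one bird); columns are the
    numbers of ON and OFF colonies. Every row and column total must be > 0.
    """
    row_a, row_b = on_a + off_a, on_b + off_b
    col_on, col_off = on_a + on_b, off_a + off_b
    total = row_a + row_b
    chi2 = 0.0
    for obs, row, col in ((on_a, row_a, col_on), (off_a, row_a, col_off),
                          (on_b, row_b, col_on), (off_b, row_b, col_off)):
        expected = row * col / total
        chi2 += (obs - expected) ** 2 / expected
    # For 1 degree of freedom the upper-tail probability is erfc(sqrt(x / 2)).
    return chi2, math.erfc(math.sqrt(chi2 / 2.0))
```

With 30 colonies per sample, a near-complete switch in one gene (for example 2 ON colonies in one population against 28 ON in the other, both invented figures) gives a P-value far below 0.0001, which is the scale of difference reported for cj0031, cj1139 and cj0685.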

As prior adaptation of a strain to replication in chickens 
may result in selection of specific ON/OFF phases and in 
order to examine the patterns of PV in other C. jejuni 
strains, a second experiment was performed with a 
chicken-adapted variant of a widely-utilized C. jejuni 
strain, 81-176. Output populations were again generated 
2 weeks after initial inoculation. Four of the genes 
examined in strain 11168 were absent or non-phase
variable in this strain but homologues of cj0045
(cj81176-0083) and cj0685 (cj81176-0708) were present
although cj81176-0083 is a pseudogene. These genes plus
four other genes were examined covering C9, G9 and G10 
tracts in the start, middle and end of the phase variable 
genes of strain 81-176. Significant differences were only 
detected for gene 81176-0083 between the outputs from 
birds B09 and B014 in comparison to the inoculum and
the outputs from birds B03 and B06 (Supplementary
Tables SX and SXI). In this experiment, 23 genotypes
were observed (using G11 as an arbitrary ON repeat
number for gene 81176-0083) with the major difference 
being variation in the levels of 1-1-1-0-1-1 and 0-1-1- 
0-1-1, which differ in the repeat tract lengths of 
81176-0083 (Figure 5). This experiment highlighted the 
potential for bird-to-bird variation in the tract lengths of 
phase variable genes but otherwise was indicative of the 
stability of these tracts over a 2-week period of persistence 
in broiler chickens. 

Modelling of the impact of mutational drift on the 
genotypic diversity of a phase variable population 

The proportions of phase variants within a population can 
change due to a combination of mutational drift, 



population bottlenecks and selection. The high PV rates 
of the C. jejuni genes suggested that changes in the pro- 
portions of genotypes could have occurred solely due to 
mutational drift. A theoretical model (see its description 
below in Section 'Stochastic model') was developed to 
examine the impact of mutational drift on population 
structure. A major assumption of this stochastic model 
was that each gene was switching independently of all 
the other genes. This assumption was tested by generating 
a theoretical distribution of genotypes from the measured 
number of ON and OFF variants for each gene and per- 
forming a comparison to the observed distribution derived 
from analysis of each colony. For most (>80%) of the 
major genotypes in both the in vitro and in vivo passage 
experiments, the proportions of observed genotypes were 
within the confidence intervals calculated for the theoret- 
ical distribution (Supplementary data and Supplementary 
Table SIX). In contrast, proportions for the minor geno- 
types observed in 1-2 colonies were outside these error 
bars. In general these results indicated that the assumption 
was valid and could be used to evaluate the impact of 
mutational drift on changes in the populations. 
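
The independence check described above amounts to multiplying the per-gene ON or OFF proportions to obtain an expected frequency for every combined genotype. A minimal sketch is given below. The function name `expected_genotype_freqs` is hypothetical, and the confidence intervals used in the study are not reproduced because their derivation is given only in the Supplementary data.

```python
from itertools import product


def expected_genotype_freqs(on_fractions):
    """Expected frequency of every genotype if genes switch independently.

    on_fractions: observed ON proportion for each gene, in genotype order.
    Returns {genotype string such as '0-0-1-1-0-1': expected frequency}.
    """
    freqs = {}
    for states in product((0, 1), repeat=len(on_fractions)):
        p = 1.0
        for state, p_on in zip(states, on_fractions):
            # Multiply by P(ON) for a '1' and by P(OFF) for a '0'.
            p *= p_on if state == 1 else 1.0 - p_on
        freqs["-".join(str(s) for s in states)] = p
    return freqs
```

Comparing these expected frequencies with the genotype frequencies counted directly from colonies indicates whether particular ON/OFF combinations are over- or under-represented relative to independent switching.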

The inputs to the model were the observed PV rates, the 
initial distribution and the number of generations. Each 
gene was assumed to switch at the rates determined using 
the reporter constructs and to oscillate between two tract 
lengths indicative of an ON and OFF state e.g. G8 (ON) 
and G9 (OFF) for cj1139. For the in vitro passage assay,
the number of generations was estimated as 15-30 depend-
ing on inoculum size and viability of the cells after over- 
night growth. The model predicted the rapid generation 
of minor genotypes, some of which were observed in 
output populations (e.g. 0-1-1-1-0-0, 1-0-1-0-0-1, 
Supplementary Figure S2B), and that equivalence of the 
Figure 5. Changes in the proportions of genotypes following in vivo passage of a chicken-adapted variant of C. jejuni strain 81-176. The genotypes
were derived by the same approach as in Figure 4 for the following six genes: 81176-0083, 81176-0646, 81176-0708, 81176-1160, 81176-1312 and
81176-1325. Genotypes for 29 small colonies (Inoc-S) and 29 large colonies of the inoculum were derived following growth on Campylobacter
selective plates. The genotypes for output populations were derived for 26-30 colonies from dilutions of caecal samples grown under identical
conditions. Caecal samples were collected 2 weeks after inoculation of chickens with 1 x 10^8 cfu. Inoc-S, small colonies in inoculum; Inoc-L, large
colonies in inoculum; B03, B06, B09 and B014, individual birds.



0-0-1-1-0-0 and 0-0-1-1-0-1 genotypes would be 
reached after 100 generations at a frequency of ~0.2 
(Supplementary Figure S2A). The genotype distributions 
from individual output populations were not significantly 
different from the inoculum distribution but significance 
(P = 0.01) was observed in a comparison between 
inoculum and the average output of all 11 cultures 
[using a one-sided Kolmogorov-Smirnov (KS) test]. 
Using the inoculum population as an input, theoretical 
output populations diverged from observed outputs as 
the model was run for 100 or more generations. A 
switch to a high prevalence (frequencies of >0.3) of the 
0-0-1-1-0-0 genotype after 20 generations could only be 
achieved in the model by increasing the capA ON-to-OFF 
and cj0685 OFF-to-ON rates by 10-fold resulting in con- 
vergence of the model and average output population to a 
non-significant difference. Overall these analyses indicated 
that the model was providing a reliable measure of the 
changes in the experimental populations but also 
suggested that a low level of selection is acting on PV of
capA and cj0685 during in vitro passage. 
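
The effect of mutational drift alone on a genotype distribution can be sketched with a two-state (ON/OFF) chain per gene, using the inputs named above: the switching rates, the initial distribution and the number of generations. The code below is a simplified illustration and not the stochastic model used in the study: it tracks expected frequencies only, with no sampling noise, bottlenecks or selection. The names `RATES`, `p_on_after` and `propagate` are hypothetical, and the rates are those listed in the Figure 6 legend.

```python
from itertools import product

# (ON-to-OFF, OFF-to-ON) rates per division, as listed in the Figure 6 legend.
RATES = [
    (10.3e-4, 17.9e-4),  # cj1326
    (10.3e-4, 17.9e-4),  # cj0031
    (6.9e-4, 2.1e-4),    # cj1139
    (2.1e-4, 6.9e-4),    # cj0685
    (38.8e-4, 3.7e-4),   # cj0045
    (38.8e-4, 3.7e-4),   # capA
]


def p_on_after(start_on, on_to_off, off_to_on, generations):
    """Probability that one gene is ON after n generations."""
    total = on_to_off + off_to_on
    equilibrium = off_to_on / total
    decay = (1.0 - total) ** generations
    start = 1.0 if start_on else 0.0
    return equilibrium + (start - equilibrium) * decay


def propagate(initial, rates, generations):
    """Expected genotype frequencies after n generations of drift alone.

    initial: {tuple of 0/1 per gene: frequency}; genes switch independently.
    """
    out = {g: 0.0 for g in product((0, 1), repeat=len(rates))}
    for start, freq in initial.items():
        p_on = [p_on_after(s, a, b, generations)
                for s, (a, b) in zip(start, rates)]
        for end in out:
            p = freq
            for state, q in zip(end, p_on):
                p *= q if state else 1.0 - q
            out[end] += p
    return out


# Hypothetical starting population consisting of a single genotype.
example = propagate({(0, 0, 1, 1, 0, 1): 1.0}, RATES, 20)
```

Running `propagate` with an observed inoculum distribution and increasing numbers of generations gives drift-only expectations that can be set against the observed output populations.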

The in vivo populations were modelled for up to 5000 
generations as the actual number of generations can only 
be roughly estimated. For colonization experiments with 
strain NCTC11168 (Figure 6), the model predicted the 
generation of some novel genotypes (e.g. 1-1-0-1-0-0 
reached 19% by 5000 generations) but only very low 
levels of the genotypes actually observed in output popu- 
lations from chickens (e.g. 0-1-0-1-0-0, 0-1-0-1-0-1 and 
0-1-0-1-1-1 were present at 11%, 1% and 0.1%, 



respectively, in model outputs after 5000 generations but 
at an average of 34%, 20% and 18%, respectively, in ex- 
perimental outputs). These distributions were significantly 
different from the inoculum for each individual output 
population or using an average of all the outputs from 
all the birds (P = 0.01 using the KS test). A non-significant 
difference between model and experimental data could 
only be attained by significantly altering the PV rates of 
multiple genes (data not shown), indicating that the dif- 
ferences between these distributions were not solely due to 
inaccuracies in the switching rates. These results indicated 
that the genotype profile observed in vivo with strain 
11168 was not due to the mutational drift associated
with the high PV rates. 

For the experiments with strain 81-176 (Figure 7), the 
model predicted that the two major input genotypes, 0-1-1-0-1-1
and 1-1-1-0-1-1, would be reduced to 8% and
4%, respectively, after 500 generations and to <5% after
2500 generations whereas these genotypes were present at
11-76% and 3-61% in the different output populations.
Low levels of novel genotypes are generated after 2500 
generations at varying levels but few of these were seen 
in output populations and there was little overlap between 
predicted and observed genotype frequencies (Figure 7). 
Statistical analyses of these distributions using the KS test 
detected significant differences between inoculum and 
outputs from birds B03 and B06 but not from birds B09 
and B014. Observed and theoretical output populations
exhibited significant divergence after 100 generations
(P = 0.01) but convergence in the case of B09 to
Figure 6. Comparison of the changes in proportions of phase variable genotypes for theoretical and experimental in vivo passaged populations of
strain NCTC11168. The proportions of genotypes for the inoculum were used as input to the theoretical model of PV, which was then run for a
varying number of generations. Genotypes were for the six genes with 0 representing an OFF phase variant and 1 an ON variant. The order of the
genes and the ON-to-OFF and OFF-to-ON switching rates (x 10^-4) were as follows: cj1326, 10.3, 17.9; cj0031, 10.3, 17.9; cj1139, 6.9, 2.1; cj0685, 2.1,
6.9; cj0045, 38.8, 3.7; capA, 38.8, 3.7. Inoc, inoculum; Output-100G, -500G, -2500G and -5000G are output data from the model for runs of 100, 500,
2500 and 5000 generations; B6, B7, B8, B9 and B11 are experimental data for output populations obtained from five different chickens.



non-significance by 500 generations. Again, the divergence 
between the predictions of the model and observed popu- 
lations suggests that mutational drift is not responsible for 
observed in vivo genotype profiles. 

The high switching rates of C. jejuni phase variable 
genes predict the rapid appearance of a steady state for 
the frequencies of genotypes. The time required and 
nature of this steady state was predicted using the 
model. With six genes switching at the highest rate (i.e. 
0.004) and starting with all genes in an OFF state, then 
744 generations were required to reach the steady state. A 
10-fold reduction in one or both directions of switching 
for one gene increased the time to 1400 and 7500 gener- 
ations, respectively. For the actual populations described 
in this study, the steady state was reached in 6600, 6600 
and 2000 generations for the in vitro, in vivo strain 
NCTC11168 and in vivo strain 81-176 assays, respectively.
Comparisons also indicated that the inoculum popula- 
tions for these experiments were not already at the 
steady-state (Supplementary Figure S2, Figures 6 and 7). 
The larger numbers for the former two cases were due to 
two of the genes, cj1139 and cj0685, exhibiting low
ON-to-OFF switching due to a G8 to G9 insertion. 
Critically, the steady state population contains small 
numbers of many genotypes rather than a bias to high 
frequencies of a few genotypes contrasting with the 
actual distributions observed in the in vivo output popu- 
lations (Figures 6 and 7). 
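
For a single gene treated as a two-state chain, the steady-state ON fraction and the number of generations needed to approach it follow directly from the two switching rates. The sketch below illustrates this relationship. The function names are hypothetical and the tolerance `tol` is an assumed convergence criterion, since the criterion used to define the steady state in the study is not given in this section, so the generation counts it returns are not expected to match the values quoted above.

```python
import math


def equilibrium_on_fraction(on_to_off, off_to_on):
    """Steady-state proportion of ON cells for one gene."""
    return off_to_on / (on_to_off + off_to_on)


def generations_to_tolerance(on_to_off, off_to_on, start_on_fraction, tol):
    """Smallest n with |P_n(ON) - equilibrium| <= tol for one gene.

    The gap to equilibrium shrinks by a factor (1 - on_to_off - off_to_on)
    each generation, so slower switching means a slower approach.
    """
    eq = equilibrium_on_fraction(on_to_off, off_to_on)
    gap = abs(start_on_fraction - eq)
    if gap <= tol:
        return 0
    shrink = 1.0 - on_to_off - off_to_on
    return math.ceil(math.log(tol / gap) / math.log(shrink))
```

Because the per-generation shrink factor depends on the sum of the two rates, the gene with the lowest combined switching rate sets the time taken by the whole population to reach the steady state.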



DISCUSSION 

Several important bacterial pathogens utilize SSR- 
mediated PV to modulate expression of surface molecules 
and to alter interactions with their hosts (1-4).
Campylobacter jejuni, a food-borne pathogen responsible 
for a large health and economic burden world-wide, 
contains multiple phase variable genes which exhibit 
ON/OFF switches in expression due to mutations in 
mononucleotide repeats of G or C nucleotides located 
within the reading frames (6). Using chromosomally- 
located reporter constructs in two phase variable genes, 
ON-to-OFF PV rates were measured for G8, G9 and 
G11 tracts and found to have high rates of 4.2 x 10^-4,
1.2 x 10^-3 and 4.1 x 10^-3 mutations/division, respectively.
Measurement of the PV rate by colony immunoblotting
for the G11 tract of the native capA gene detected an
ON-to-OFF switching rate of 1.6 x 10^-3 mutations/
division. The high switching rates of the C. jejuni genes
are similar to the PV rates mediated by tetranucleotide
repeats in H. influenzae, which ranged from 1.4 to
5.6 x 10^-4 mutations/division for tracts of 17-38 repeats.
The C. jejuni PV rates are, however, significantly higher
than detected for polyG tracts in N. meningitidis but
similar to PV rates detected in MMR mutants of this
species (PV frequencies for meningococci rose from 1 to
3 x 10^-5 in wild-type strains to 2-8.3 x 10^-3 for a mutS
mutant for G10 and G12 tracts whereas in C. jejuni a G11
Figure 7. Comparison of the changes in proportions of phase variable genotypes for theoretical and experimental in vivo passaged populations of
strain 81-176. The proportions of genotypes for the inoculum were used as input to the theoretical model of PV, which was then run for a varying
number of generations. Genotypes were for the six genes with 0 representing an OFF phase variant and 1 an ON variant. The order of the genes and
the ON-to-OFF and OFF-to-ON switching rates (x 10^-4) were as follows: 81176-0083, 38.8, 3.7; 81176-0646, 17.9, 10.3; 81176-0708, 10.3, 17.9;
81176-1160, 17.9, 10.3; 81176-1312, 10.3, 17.9; 81176-1325, 10.3, 17.9. Inoc-S, small colonies in inoculum; Inoc-L, large colonies in inoculum; S-100,
S-500 and S-2500 are output data from the model for runs of 100, 500 and 2500 generations using Inoc-S as the input to the model; B03, B06, B09
and B014 are experimental data for output populations obtained from four different chickens.



tract has a PV frequency of 1.8 x 10^-2 or 5 x 10^-2 for the
capA and cj1139lacZ genes, respectively). While the
absence of homologues of the canonical mutS and mutL 
MMR genes in C. jejuni may indicate the lack of a func- 
tional MMR system, the mutation rates due to point mu- 
tations as measured by generation of nalidixic acid or 
ciprofloxacin resistance are low (1 x 10^-8 to 1 x 10^-9)
for most strains (30,31), suggesting the presence of 
systems for repair of mismatched base pairs. 
Contrastingly, the high PV rates in C. jejuni may arise 
because this species lacks a system for efficient repair of 
the insertion and deletion mutations associated with 
polyG/C repeat tracts. 

Tract length is a major determinant of the mutability of 
SSRs. Alteration of the G8 tract in cj1139 to G11 by
site-directed mutagenesis increased the switching rate by 
10-fold. Tract length is, therefore, a major determinant of 
PV in C. jejuni as observed for PV in other bacterial 
species (5,9). Strikingly, a change in the pattern of 



mutations in the C. jejuni reporter constructs was 
observed from a bias towards insertions in G8 and G9 
tracts to deletions in G10 and G11 tracts. Biases in the
correction of insertions and deletions have been detected 
in other systems (32-36) with an indication of a shift from 
correction of indels by the proof-reading subunits of DNA 
polymerase to MMR as mononucleotide tract length 
exceeds 7-8 nt (35,36). Thus the mutational spectra in 
the C. jejuni polyG tracts may reflect a transition from 
relatively efficient repair of -1 deletions in G7/G8 tracts
to absence of repair in G10/G11 tracts and hence a shift in
the mutational spectra. Alternatively, there may be 
another repair pathway active in this species. The most 
intriguing aspect is, however, that these mutational 
spectra are reflected in the prevalence of G9 and G10 
tracts in the phase variable genes of C. jejuni, suggesting 
that tract length is determined by molecular drivers rather 
than by selection for a particular switching rate. Repeat 
numbers for polyG tracts in meningococcal genomes tend 
to be longer, probably due to selection for PV rate as
these tracts have lower mutation rates in meningococci 
due to the presence of an active MMR system. 
Other Campylobacter species exhibit larger numbers of 
and longer polyG/polyC tracts than C. jejuni, e.g. 
C. upsaliensis contains 89 tracts of G7 or more of which 
59 contain 12 or more repeats (37). Our results suggest 
that divergence between the Campylobacter genomes in 
the polyG tracts may be driven by differences in the 
ability of the replicative machinery or repair systems to 
correct mutations in these tracts rather than selection for 
heightened PV rates. 

Multiple phase variable genes enable bacterial species to 
rapidly access a significant amount of genetic diversity 
(38). Six genes switching between ON and OFF results 
in 64 different genotypes while 27 genes will generate 
1.3 x 10^8 genotypes. However, the high PV rates
detected for the C. jejuni genes mean that populations
of this bacterial species will rapidly reach a steady state 
combination of genotypes. Utilizing the measured PV 
rates for different tract lengths and a model of independ- 
ent switching of each locus, we estimate that C. jejuni 
populations will reach this steady state in 2000 to 7000 
generations for six genes depending on the tract lengths 
of these genes. These measurements assume that the en- 
vironments encountered in vivo do not induce a change in 
the PV rates. The rate of approach to this steady state is 
limited by the gene with the lowest switching rate. If we 
assume a replication time of 1 h and a population of sufficient
size (>1 x 10^8 cfu), then this steady state will be
reached in 12 to 42 weeks. C. jejuni can persist in the 
caeca of chickens at high levels for >12 weeks and so 
this steady state might be achieved in some hosts. Our 
observations indicate, however, that other factors might 
prevent C. jejuni populations from reaching this 
mutation-driven steady state. 
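
The figures quoted in this paragraph can be checked with a few lines of arithmetic. The sketch below assumes, as in the text, one generation per hour. The helper names `genotype_count` and `weeks_for` are hypothetical.

```python
HOURS_PER_WEEK = 24 * 7


def genotype_count(n_genes):
    """Number of ON/OFF combinations for n independently switching genes."""
    return 2 ** n_genes


def weeks_for(generations, hours_per_generation=1.0):
    """Elapsed time in weeks for a given number of generations."""
    return generations * hours_per_generation / HOURS_PER_WEEK


six_genes = genotype_count(6)          # 64
twenty_seven_genes = genotype_count(27)  # 134217728, i.e. about 1.3 x 10^8
lower_weeks = weeks_for(2000)          # about 12 weeks
upper_weeks = weeks_for(7000)          # about 42 weeks
```

A slower replication time in vivo would lengthen these estimates in proportion, since the elapsed time scales linearly with the hours per generation.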

Two different patterns were observed in the in vivo ex- 
periments. In one case (Figure 4), we observed major dif- 
ferences between the input and output populations in the 
ON and OFF states of multiple genes. These differences 
could not be replicated by the model and are strongly 
suggestive of selection acting on some of these loci
(although we cannot dismiss the possibility of changes
due to other processes such as hitch-hiking with a
mutation in another part of the genome). In the second 
case (Figure 5), we observed evidence of bird-to-bird vari- 
ation in the prevalence of two genotypes and absence of 
the minor genotypes predicted by the model to start ap- 
pearing between 100 and 500 generations. The absence of 
the minor genotypes may be due to the limited numbers of 
colonies examined but the bird-to-bird variation is sug- 
gestive of a random reduction in population size or a 
'bottleneck'. This is particularly likely because the gene, 
81176-0083, exhibiting most difference is a pseudogene. 
Both selection and bottlenecks will prevent C. jejuni popu- 
lations from reaching the mutational steady state with 
small bottlenecks resulting in constant re-setting of the 
genetic diversity of the population back to the major 
genotypes or indeed oscillation between major genotypes 
if the bottleneck is small enough. Bottlenecks may arise 
due to the daily excretion of the caecal material and 



re-colonization of the new contents by bacterial cells 
attached to the epithelium of the caeca or from other 
parts of the gastrointestinal tract. Additionally, there are 
major bottlenecks and selective pressures associated with 
transmission and initial colonization of birds. Similarly, 
an adaptive response is elicited during persistence of 
C. jejuni in birds (24,39) and these immune responses 
may impose significant levels of selection for variation in 
the ON/OFF states of phase variable genes. 

Finally, a feature of all the observations of the C. jejuni
populations is a significant level of genetic variation
due to variations in the ON/OFF status of the phase 
variable genes. The functions of many of these genes are 
unknown or poorly characterized but this variation is 
likely to cause variations in host colonization and persist- 
ence between C. jejuni isolates of the same strain. The 
failure of signature-tagged mutagenesis screens and subse- 
quent evidence of variable colonization levels between 
isogenic WITS-tagged isolates may be due to differences 
in the expression status of phase variable genes (20). High 
levels of genetic variation are generated by the high PV 
rates of C. jejuni genes and should be of major concern 
during design of experiments with this bacterial species. 
This high level of PV-generated phenotypic variation may 
facilitate survival of this species during adaptation to a 
range of hosts and environments but may have most 
impact on survival of the diverse bacteriophage popula- 
tions known to infect Campylobacter and to attach to 
phase variable epitopes (40,41). 

SUPPLEMENTARY DATA 

Supplementary Data are available at NAR Online: 
Supplementary Tables I-XI, Supplementary Figures 1-3; 